Introduction.


LINE-1 (L1) is the most abundant active human retroelement, and it contributes to genomic
instability. The full extent of L1 mobility in somatic tissues and tumors is not known.
L1 expression is extremely low in differentiated cells except for testis, but it is
significantly elevated in breast malignancies. These observations suggest that
posttranscriptional mechanisms are involved in limiting L1 expression. We
demonstrated that the use of polyA sites located within the L1.3 genome limits the
amount of full-length L1.3 mRNA present in mammalian cells. Internal L1 polyA signals
can also be functional when fragments of L1.3 elements are inserted into 3’ untranslated
regions of mammalian genes. This unique attenuation mechanism helps to minimize the
rate of L1 retrotransposition, but it may also increase the negative impact of L1
insertions on the genome. Human EST database searches suggest that
these internal polyA signals may also play a role in regulating L1 expression in a
tissue- and/or tumor-specific manner, with breast cancer tissues supporting the least
efficient internal L1 polyadenylation. The EST data are strengthened by significant
differences in the L1 RNA profiles between breast cancer and non-breast cancer cell lines,
detected by northern blotting of transiently transfected human L1.3. These observations
suggest a potential global change in the polyadenylation process upon malignant
transformation of the mammary gland.

Body.


In accordance with the approved Statement of Work, we have made the following progress in
the second year of funding:

Task 1. To identify functional polyA sites in the L1.3 genome.

A-B. We developed a sensitive northern blotting assay that allowed detection of truncated
L1-related RNA species produced from a construct containing a full-length L1.3. The
result was consistent with premature internal polyadenylation. Using site-directed
mutagenesis, 3’ RACE, and human and mouse EST database analysis, we confirmed that
multiple predicted canonical and noncanonical internal polyA signals identified within the
L1.3 coding region are functional (Perepelitsa-Belancio and Deininger, Nature Genetics,
2003; a reprint of the publication is attached).

The experimental design described in this portion of the original proposal used a
reporter system with an intron located upstream of the Renilla luciferase (Rluc) gene
(Fig. 1). We created a series of constructs containing fragments of the L1.3 genome with
the strongest predicted polyA signals in the above-described intron (opiluc constructs). As
stated in the previous Annual Summary Report, we judged this approach unsuitable for our
study because of the competition between splicing and polyadenylation (1). We subsequently
developed a method for detecting functional polyadenylation signals within the entire
L1.3 genome (2). The discovery of functional polyA sites located throughout the L1.3 coding
region led us to revisit this reporter gene system, because it can address an important
question about the impact of L1 insertions on mammalian genomes: can internal L1 polyA
signals be used after an element inserts into an intron of a mammalian gene, and thereby
attenuate the expression of that gene? In our previous work, we relied on a reduction in
Rluc activity to detect functional polyA sites located in the intron of the reporter gene.
However, if the polyA sites are used inefficiently because of competition
with splicing, the changes in enzymatic activity may fall within the experimental error of
the assay. Such subtle differences would not be detectable in transient
transfections, yet they could be relevant in an appropriate biological system. For example,
L1 insertions into the introns and 3’ untranslated regions (UTRs) of mammalian genes are
less detrimental to the genome. As a result, a number of L1 fragments are found
within introns. For a gene containing multiple introns (which is often the case
for mammalian genes), the combined impact of premature termination may significantly
impair protein production. We are currently working out the conditions for a 3’ RACE
analysis of RNA species produced by opiluc constructs upon transient transfection of NIH
3T3 cells.

C-F. We previously reported that insertion of fragments of the L1.3 genome into the 3’ UTR
of a reporter gene (FRM/N series of constructs, Fig. 2) significantly decreased the total
amount of mRNA produced (Fig. 3, page 14 of the appendices, Annual Summary Report
2003). This result was inconclusive as to whether the putative L1.3 polyA sites were
functional because (i) no slower-migrating bands were detected on the northern blots and
(ii) the reduction in total mRNA could have resulted from decreased stability of the mRNAs
containing L1.3 fragments. Alternatively, the result may reflect a combination of the two. At
the time, we were interested in developing an adequate approach to detecting functional
polyA signals in the system that would be most biologically relevant to the L1.3 life
cycle (see above). However, with the knowledge we gained by identifying functional
internal L1.3 polyA sites in the context of the full-length L1.3 element, we were compelled
to revisit the FRM/N system. This system can answer the important question
of whether L1 polyA signals can be used when L1 fragments are inserted into the 3’
UTRs of mammalian genes, an arrangement often found in mammalian genomes. We used
a sensitive 3’ RACE analysis to detect any truncated RNA species resulting from use
of the internal L1.3 polyA signals in FRM/N, our control vector, and in FRM/N#2, a
construct that contains a 1 kb L1.3 fragment cloned into the 3’ UTR of FRM/N (Fig. 3).
We chose 3’ RACE because the northern blotting assay may not be
sensitive enough to detect truncated products. As expected, we detected multiple truncated
bands (250 bp to 1.2 kb) whose sizes corresponded to the positions of the polyA signals in
the L1.3 fragment. Sequence analysis of some of the truncated bands revealed the use of
noncanonical L1.3 polyA sites. This result also provides a direct explanation of the
observation that the majority of the L1 insertions identified in human and mouse genes are
in the reverse orientation (3).

We also removed the very strong SV40 polyadenylation signal present in the FRM/N
and FRM/N#2 vectors. This approach is designed to test whether the use of internal
L1.3 polyadenylation sites varies with the strength of the polyA signal at the end
of a gene. The rationale for this experiment is the reported competition between
polyadenylation signals, which depends on their relative strength and position. We are
currently working out optimal conditions for the 3’ RACE analysis of the FRM/NΔSV40 and
FRM/N#2ΔSV40 vectors. We do not expect to obtain quantitative data from this
experiment because 3’ RACE is not a quantitative assay. However, it should give us a better
understanding of how the genomic sequences surrounding an L1 fragment influence the
use of the polyA sites it encodes. We expect that the band pattern produced from the
FRM/N#2ΔSV40 vector may differ from that produced from FRM/N#2.

The data produced by the experiments described in A-B and C-F will be combined
for publication with the bioinformatic data demonstrating the frequency of L1
insertions in the introns and 3’ UTRs of human genes.

Task  2.  To  identify  the  function  and  relative  strength  of  functional  polyA  sites  in  normal 
and  breast  cancer  cells. 

We have compared the L1.3 RNA profiles in the mouse fibroblast cell line NIH 3T3
and in SK-Br-3, an aggressive human breast cancer cell line that overexpresses the ErbB3
receptor. Northern blot analysis with the strand-specific 5’ UTR probe demonstrated
significant differences in the use of the internal polyadenylation sites between these
two cell lines (Fig. 4). The SK-Br-3 breast cancer cell line does not support efficient
internal polyA site usage, so the full-length L1.3 mRNA is the major product in this cell
type. In contrast, only 2-4% of the total L1-related products correspond to the full-length
L1 when the same construct is transiently transfected into NIH 3T3 cells, HeLa cells, or
chicken fibroblasts. This result strongly indicates that there is a significant difference
in the use of the internal polyA signals during L1 transcription in breast cancer vs.
fibroblast cell lines.

These data are consistent with our preliminary bioinformatic results. We used 100 bp
fragments of L1.3 as BLAST query sequences to search existing human EST
databases. The rationale behind this approach was to determine whether there is a tissue-
and/or cancer-specific difference in the use of internal polyA sites by active endogenous
human L1 elements. We found that the proportion of L1-related 3’ end ESTs varied
significantly among different human tissues (Table 1). The distribution of the L1 3’ ends in
lung, stomach, and retina was consistent with the L1 profiles detected by northern blot
analysis in mouse fibroblasts. In contrast, the pattern of the L1-related 3’ ends in
placenta, testes, ovary, and brain differed from that detected in other tissues and by
northern blotting in NIH 3T3 cells. We also tested whether polyA
site usage varies between normal and cancer tissues of different human organs (Table 2). We
detected a significant difference in the proportions of the L1-related 3’ end ESTs in normal
vs. tumor tissues of brain and ovary. Although we found no normal breast L1 sequences
for comparison, the majority of L1 elements found in breast cancer were full-length.
Additionally, the proportion of ESTs corresponding to the full-length L1 element differed
significantly, 4- to 7-fold, between some normal and tumor tissues, supporting a potential
increase in the L1 retrotransposition rate upon malignant transformation. The EST data from
mammary gland, even though very limited, correlate with the profile of L1 internal
polyadenylation detected by northern blotting in the SK-Br-3 breast cancer cell line. More
detailed analysis of the human and mouse EST databases will be required to strengthen
these preliminary results. Unfortunately, this kind of analysis is beyond our technical
capability. Our data suggest that internal L1 polyA sites are differentially used in human
tissues and upon malignant transformation of breast, brain, and ovary. The data indicate
that internal polyadenylation may play a role in regulating the amount of full-length
L1 RNA and therefore retrotransposition.
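The EST search described above can be sketched in two steps: cutting the element into 100 bp query fragments, and converting per-bin EST counts into the fractions reported in Tables 1 and 2. This is only an illustrative sketch. The function names, the choice of non-overlapping windows, and the example counts are assumptions made here, not details taken from the report, and the BLAST search itself is not shown.

```python
def query_windows(seq, size=100):
    """Split a sequence into consecutive, non-overlapping query fragments.

    Returns (start, fragment) pairs; the last fragment may be shorter.
    Non-overlapping windows are an assumption of this sketch.
    """
    return [(start, seq[start:start + size]) for start in range(0, len(seq), size)]


def bin_fractions(counts):
    """Convert raw EST counts per bin into fractions of the tissue total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Placeholder sequence standing in for the L1.3 element.
    windows = query_windows("A" * 250)
    print([(start, len(frag)) for start, frag in windows])

    # Hypothetical counts of 3' end ESTs per bin for one tissue.
    print(bin_fractions({"bin_1": 1, "bin_2": 3}))
```

Reporting fractions rather than raw counts makes tissues with very different numbers of ESTs comparable, which is why the total number of transcripts per tissue is listed alongside the fractions in the tables.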

We will continue to explore L1 polyadenylation profiles in other breast cancer cell
lines.

We will also extend the experiments described in Task 1 (C-F) by using FRM/N-based
constructs in different breast cancer cell lines.

Task  3.  To  identify  the  effect  of  functional  polyA  sites  on  the  rate  of  LI  retrotransposition. 

We have shown that mutation of functional internal L1.3 polyA signals results in
more efficient use of the polyA sites located in the vicinity (Perepelitsa-Belancio
and Deininger, 2003). Removal of five functional L1.3 internal polyadenylation signals
led to a modest but statistically significant increase in the rate of retrotransposition in
HeLa cells. We will next determine the rate of L1.3 retrotransposition in the SK-Br-3 cell
line and in other breast cancer cell lines.




Key  Research  Accomplishments. 


Task 1. To identify functional polyA sites in the L1.3 genome (Months 1-25).

Task 2. To identify the function and relative strength of functional polyA sites in normal
and breast cancer cells (Months 25-29).

Task 3a. To identify the effect of functional polyA sites on the rate of L1 retrotransposition
(Months 29-36).


• We determined that putative canonical and noncanonical polyA sites found in the
coding region of the human L1.3 element are functional.

• We demonstrated that removal of five functional polyA signals resulted in a
statistically significant increase in the rate of L1 retrotransposition in HeLa cells.

• We demonstrated by northern blotting of transiently transfected breast cancer cells that
the internal L1 polyA signals are used much less efficiently in the SK-Br-3
breast cancer cell line.

• We performed a human EST database search to determine whether there are any
differences in the use of internal polyA signals in breast tissues.

• We performed a human EST database search to determine whether there are any
differences in the use of internal polyA signals between normal human tissues and
human breast cancers.

• We performed a 3’ RACE analysis to identify functional polyA signals in an L1
fragment located in the 3’ UTR of the reporter gene.

Reportable outcomes:


1. The appended manuscript “RNA truncation by premature polyadenylation attenuates
human mobile element activity” has been published in Nature Genetics, Volume
35, Number 4, December 2003.

2. Responses to the publication: press release “The silence of the LINEs”, “Silent garbage”.

3. Appended is the abstract “RNA truncation by premature polyadenylation attenuates
human mobile element activity” of the presentation given at the annual Molecular
and Cellular Biology Program Retreat, Tulane University, New Orleans, LA, in October
2003.

4.  A  Ph.D.  in  Molecular  and  Cellular  Biology  from  Tulane  University 

5. 17th  Morris  F.  Shaffer  and  Margaret  H.D.  Smith-Shaffer  Award  for  Excellence  in 
Research,  received  in  May,  2004. 

6. Patent application 60/445,945 (extension filed on February 7, 2004).

Deininger, Prescott L. and Victoria Perepelitsa Belancio
Entitled: Mammalian Retrotransposable Elements.

Conclusions:


1. We demonstrated that both canonical and noncanonical polyA sites found in the
coding region of the human L1.3 element are functional, and that they attenuate the
amount of full-length mRNA and the rate of retrotransposition in HeLa cells.

2. Our EST data and transient transfections of the L1.3 expression cassette into a human
breast cancer cell line suggest that internal polyadenylation may be
involved in regulating the amount of full-length L1 mRNA in a tissue-specific
manner and upon malignant transformation.

3. Our 3’ RACE analysis demonstrated that internal L1 polyadenylation sites are used
during transcription of a luciferase reporter gene when the L1 fragment is cloned
into the 3’ UTR of the gene. This finding has significant implications for our
understanding of mammalian gene expression.

Figure 1. The opiluc reporter construct. We have built this reporter vector, which
expresses the Renilla luciferase gene from the RSV promoter. The vector has a multilinker
in an intron region; the site labeled XhoI is the primary site used for cloning the polyA
sites. Below it is the segment of the L1.3 sequence that we have subcloned. The vertical
lines with numbers represent predictions of the relative strengths of the polyA sites made
with the polyadq program.

Figure 2. FR Mlu/Not linker constructs. 1 kb fragments of the L1.3 genome were cloned
into the 3’ UTR of the Firefly (F) luciferase reporter gene. Vertical black lines represent
functional polyA signals identified in the L1.3 genome. The FR M/N construct and its mRNA
products are shown on the left side of the panel. The FRM/N#2 construct and its potential
mRNA products (if the L1.3 polyA sites are used during transcription) are shown on the
right side of the panel. A red arrow indicates the position of the upstream
F luciferase-specific primer used in the 3’ RACE analysis.


Figure 3. 3’ RACE analysis of the FR and FR#2 constructs with the
Firefly luciferase-specific primer. NIH 3T3 cells were transfected with FR or FR#2
expression cassettes. Total RNA was harvested 24 h post transfection, polyA selected, and
subjected to 3’ RACE analysis. The size of the band detected for the FR construct is
consistent with use of the SV40 polyadenylation signal. This was also confirmed by sequence
analysis of the DNA. Multiple bands observed for the FR#2 construct are consistent with use
of polyA sites located in the L1.3 fragment cloned into the 3’ UTR of the luciferase reporter
gene.


Figure 4. Profiles of wt L1 internal polyadenylation in mouse fibroblasts and human breast
cancer cell lines. NIH 3T3 and SK-Br-3 cells were transiently transfected with the wt L1.3
expression cassette. Total RNA was harvested 24 h post transfection, polyA selected,
fractionated by electrophoresis, and transferred onto a nylon membrane. Northern blot
analysis was carried out with the strand-specific 5’ UTR probe. 3T3L1 and SKL1 are NIH
3T3 and SK-Br-3 cells transfected with the L1.3 expression cassette; 3T3 and SK are
nontransfected NIH 3T3 and SK-Br-3 cells, respectively.


Table 1. Percentage of human L1 ESTs in different tissues. ESTs were identified using
BLAST and the L1 consensus query region shown in the left column. Fractions of ESTs
identified in the various bins are shown, along with the total number of transcripts studied
for each tissue type. The highest fraction for each tissue is highlighted in red, with those of
secondary abundance shown in green. Yellow marks the proportion of full-length
transcripts.

Table 2. Percentage of human L1 ESTs in normal vs. tumor tissues. ESTs were identified
using BLAST and the L1 consensus query region shown in the left column. Fractions of
ESTs identified in the various bins are shown, along with the total number of transcripts
studied for each tissue type. The highest fraction for each tissue is highlighted in red, with
those of secondary abundance shown in green. Yellow marks the proportion of full-length
transcripts.


RNA  TRUNCATION  BY  PREMATURE  POLYADENYLATION  ATTENUATES 
HUMAN  MOBILE  ELEMENT  ACTIVITY. 


Victoria  P.  Belancio  and  Prescott  Deininger 

LINE-1 is the most prevalent human retroelement, and it contributes to genomic instability.
LINE-1 elements comprise 17% of the human genome, which translates into about 5 x 10^5
L1 copies, the majority of which are truncated at their 5’ end. The full extent of LINE-1
mobility in somatic tissues, and particularly in tumors, is still not known. LINE-1 is the only
member of the autonomous non-LTR retrotransposon family that is currently active. It is
believed that about 60 active copies of full-length LINE-1 are present in the human
genome.

L1 expression is extremely low in all cell types of a mature organism except for testis. In
contrast, significantly higher levels of LINE-1 expression were found in various cancer
cells. The known factors involved in regulation of LINE-1 expression, such as promoter
activity and the hypomethylation often associated with malignant transformation,
cannot fully explain the observed pattern of expression. In addition, full-length LINE-1 is
not detected by northern blotting in cell culture even when transiently expressed from the
CMV promoter. Together, these observations suggest that posttranscriptional mechanisms
might be involved in regulation or limitation of L1 expression. Using the polyadq
program, we identified 20 putative polyadenylation (polyA) sites located only in the sense
strand of the L1.3 genome. We hypothesize that the use of the putative polyA sites located
within the L1.3 genome, together with RNA instability, limits the amount of full-length
L1.3 mRNA present in mammalian cells.

To test this hypothesis, we developed a sensitive northern blot assay that allows detection
of the full-length L1 mRNA as well as any RNA species produced through internal
polyadenylation. We determined that the A-rich coding strand of L1 elements contains
numerous internal polyadenylation sites that attenuate full-length L1 RNA formation by
about 50-fold. There is tremendous redundancy in these internal polyadenylation sites, and
their presence is conserved throughout mammalian L1 elements. This unique attenuation
mechanism helps to minimize the rate of L1 retrotransposition, but may also increase the
negative impact of these insertion events on the genome.

RNA truncation by premature polyadenylation attenuates
human mobile element activity

Victoria  Perepelitsa-Belancio  &  Prescott  Deininger 


Long interspersed elements (LINE-1s, also called L1s) are the
only active members of the autonomous, non-long terminal
repeat (LTR) retrotransposon family, which reshapes
mammalian genomes in many different ways1-5. LINE-1
expression is low in most differentiated cells but high in some
cancer cells, in testis and during embryonic development6-12.
To minimize the negative impact on their hosts' genomes, many
mobile elements strategically limit their amplification potential,
particularly in somatic cells13-15. Here we show that the A-rich
coding strand of the human LINE-1 contains multiple functional
canonical and noncanonical polyadenylation (poly(A)) signals,
resulting in truncation of full-length transcripts by premature
polyadenylation. This attenuation lowers the rate of
retrotransposition in assays using HeLa cells. It probably also
increases the negative effects of LINE-1 insertions into genes16.

Analysis of the LINE-1.3 sequence with the POLYADQ17 program
identified 19 potential poly(A) signals, AATAAA and ATTAAA, in the
sense orientation but only 2 in the antisense strand of the human
LINE-1 element (Supplementary Table 1 online). Several sites are
predicted to be much stronger than the relatively weak poly(A) site found
at the 3' end of the LINE-1 element18. Gorilla and mouse elements
showed a similar enrichment in the sense strand (Fig. 1). Additionally,
there are 141 noncanonical poly(A) sites distributed throughout the
LINE-1.3 genome, deviating from the consensus by only one base
(Supplementary Table 1 online). The widespread presence of these
poly(A) signals suggests that they have a conserved function, perhaps
limiting, or regulating, LINE-1 retrotransposition.
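The hexamer counts above can be illustrated with a simple string scan. The sketch below is not the POLYADQ algorithm, which scores candidate signals using more than the hexamer itself; it only locates AATAAA and ATTAAA and hexamers that differ from either by one base, on a given strand. The function names and the example sequences are hypothetical and chosen here for illustration.

```python
CANONICAL = ("AATAAA", "ATTAAA")


def reverse_complement(seq):
    """Return the reverse complement so the antisense strand can be scanned."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def one_mismatch_variants(hexamer):
    """All hexamers that differ from the given hexamer at exactly one position."""
    variants = set()
    for i, base in enumerate(hexamer):
        for alt in "ACGT":
            if alt != base:
                variants.add(hexamer[:i] + alt + hexamer[i + 1:])
    return variants


def find_signals(seq):
    """Locate canonical and single-mismatch hexamers on the strand as given.

    Positions are 0-based offsets of the first base of each hexamer.
    """
    seq = seq.upper()
    canonical = set(CANONICAL)
    # AATAAA and ATTAAA are one-base variants of each other, so exclude them.
    near = set().union(*(one_mismatch_variants(h) for h in CANONICAL)) - canonical
    hits = {"canonical": [], "noncanonical": []}
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in canonical:
            hits["canonical"].append((i, hexamer))
        elif hexamer in near:
            hits["noncanonical"].append((i, hexamer))
    return hits


if __name__ == "__main__":
    sense = "GGAATAAACCAGTAAAGG"  # hypothetical test sequence, not L1.3
    print(find_signals(sense))
    print(find_signals(reverse_complement(sense)))
```

Scanning the sequence and its reverse complement separately gives the per-strand comparison described in the text. A plain hexamer match of this kind will overcount relative to a scoring program, since it ignores downstream sequence context.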

By transiently expressing an active human LINE-1.3 (ref. 19) element
in mouse NIH 3T3 cells, we detected two poly(A)+, high-molecular-weight,
LINE-1.3-specific bands that migrated between 7.4 and 9.4
kb (Fig. 2a). The doublet was observed with probes to both the 5' UTR
and the neomycin marker at the 3' end of the LINE-1.3 construct (Fig.
2b), indicating that these RNAs represent full-length transcripts. The
presence of the doublet is consistent with the inefficient splicing of the
intron20 (Fig. 2b). Multiple, faster-migrating species were also present
and much more abundant than the full-length RNAs. The sizes of
these transcripts roughly corresponded to the positions expected with
use of the putative poly(A) sites identified in the LINE-1.3 sequence
(Fig. 2a). The most intense band, band 3, correlated with the strongest
predicted poly(A) site. The bands were absent from the flow-through
fraction of the oligo-dT selection and were not detected by the
neomycin strand-specific probe located downstream of these termination
sites (data not shown).

In NIH3T3 cells transfected with the LINE-1.3 expression vector
lacking the intron-containing neor tag, the pattern of the truncated
bands was identical to that detected with the 5' UTR probe (Fig. 2a).
Although there was a higher proportion of full-length transcripts,
suggesting that the neor tag interferes somewhat with full-length RNA
production, these data indicate that the premature polyadenylation is
not an artifact of the splicing introduced in the reporter system. We
therefore used the LINE-1.3neor expression vector for future
experiments to correlate RNA profiles with the rate of retrotransposition.

Northern blotting of RNAs from NIH3T3 cells transfected with a
mouse element, LINE-1spa21,22 (Fig. 2a), detected abundant truncated
bands corresponding in size to the positions of the putative
poly(A) signals in this element. This shows that premature
polyadenylation is conserved in mammalian LINE-1 elements. The
presence of these polyadenylation sites correlates with the high
(~40%) A-residue content in the LINE-1 coding region.
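The A-residue content mentioned above is a simple base-composition measure. A minimal sketch of how such a fraction could be computed for one strand is given below; the function name and the example sequence are assumptions for illustration, and ambiguous bases are simply ignored.

```python
def base_fraction(seq, base="A"):
    """Fraction of unambiguous bases (A, C, G, T) in seq that equal `base`."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    return counted.count(base.upper()) / len(counted)


if __name__ == "__main__":
    # Hypothetical short sequence, not taken from LINE-1.
    print(base_fraction("AATAAACGGT"))
```

Because the measure is strand specific, an A-rich sense strand implies a T-rich antisense strand, which is consistent with poly(A) hexamers being enriched in the sense orientation.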

Two inactivating point mutations introduced into the hexamer of the
strongest predicted poly(A) site23 (mutant 1) resulted in the complete
loss of the wild-type band 3 in NIH3T3 cells (Fig. 2a) and more
efficient use of the nearby poly(A) sites (Fig. 2a). The amount of
full-length LINE-1.3 mRNA (Fig. 2a) was similar in the wild type and
mutant 1 (Supplementary Table 2 online). The RNA profile of
LINE-1.3 with mutations in canonical poly(A) sites that were expected to
produce bands 4 and 5 in the background of mutant 1 (mutant 3) was
similar to the band pattern of mutant 1 (Supplementary Fig. 1 online).
This suggested that use of some of the multiple noncanonical poly(A)
signals in the region compensates for the removal of the classical sites.
Inactivation of the noncanonical poly(A) signals at positions 2,053 or
2,079 in the background of mutant 3 did not significantly change the
intensity of band 4 (Supplementary Fig. 1 online and data not shown).
But band 4 disappeared in the mutant lacking both of these poly(A)
sites in the mutant 3 background (mutant 5) (Fig. 2a). Removal of
these poly(A) signals shifted the RNA intensities to higher-molecular-weight
bands, indicating more efficient use of the downstream sites, as
judged by the intensity of band 2 (Fig. 2a and Supplementary Table 2
online). Thus, internal poly(A) sites present in the LINE-1.3 genome
may have a modest effect on limiting the LINE-1.3 expression individually
but together result in substantial transcriptional attenuation.

Figure 1 Putative poly(A) sites in the human (LINE-1.3), gorilla (LINE-1Gg-1A)
and mouse (LINE-1spa) LINE-1 elements. The genomes were aligned
according to the beginning of the open reading frame (ORF) 1. An asterisk
marks the strongest predicted poly(A) site. Below the nucleotide scale is a
schematic of the LINE-1 retrotransposition cassette29. The approximate
positions of the endogenous LINE-1 promoter (PRO), coding regions (ORF1
and ORF2), 3' UTR and poly(A) site (L1.3pA), intron (IN) and inverted
neomycin resistance gene (large arrow pointing to the left containing the intron
sequence) are shown. The SV40 polyadenylation site (SV40pA) is located
immediately downstream of the LINE-1 cassette. Horizontal lines represent the
predicted mRNA species with the thickness of the line suggesting that
different sites probably truncate higher proportions of the RNAs. The bottom
dotted line is consistent with read-through transcripts18. Thick black horizontal
arrows labeled 5' UTR probe, neor probe and intron probe reflect genomic
positions of the strand-specific probes used for northern-blot analysis.

Despite limitations in detection of modest changes of full-length
LINE-1 RNA levels, mutants 1 and 5 both caused a statistically significant
increase in retrotransposition rate, by a factor of almost 2, in HeLa
cells (Supplementary Table 3 and Supplementary Methods online).

Endogenous  LINE-1  RNAs  from  human  Ntera2  and  HeLa  cells  had 
a  similar  series  of  truncated  transcripts  (Fig.  2c).  We  also  compared 
the  band  distribution  in  transiently  transfected  and  nontransfected 
HeLa  cells  to  those  in  transfected  chicken  and  mouse  cells  (Fig.  2c). 
We  observed  similar  patterns  with  only  modest  differences,  suggesting
species-specific  or  cell  type-specific  variations  in  use  of  poly(A)
sites.  Our  results  are  consistent  with  a  number  of  previous  observations
that  both  human  and  mouse  cells  contain  a  number  of  smaller,


heterogeneous,  LINE-1-related  bands6,12,24,25  of  unknown  origin,
with  only  a  few  cell  types  containing  full-length  LINE-1  RNA6,12,25. 
Full-length  transcripts,  but  not  truncated  bands,  were  detected  in  cytoplasmic,
poly(A)+-selected  RNAs  from  Ntera2  cells25.  Therefore,  we
tested  cytoplasmic  versus  nuclear  poly(A)-selected  fractions  in  transfected
NIH  3T3  cells  and  found  that  the  truncated  bands,  as  well  as  the
full-length  transcripts,  were  almost  exclusively  nuclear  (Supplementary
Fig.  2  online).  This  is  consistent  with  observations  that  truncated  bands
from  mouse  LINE-1  are  primarily  in  the  nucleus12.

We  found  that  most  of  the  transcripts  in  the  human  expressed-sequence
tag  (EST)  database  with  3'  end  sequencing  of  LINE-1  elements
are  truncated  at  positions  predicted  by  the  internal  poly(A)  sites
(Fig.  3),  further  confirming  the  use  of  the  internal  poly(A)  sites  in  the 
endogenous  LINE-1  elements.  A  similar  analysis  of  mouse  EST 


Figure  2  Use  of  the  internal  poly(A)  sites  of  LINE-1  element.  (a)  Northern-blot  analysis  of  LINE-1.3  RNA  species  produced  by  the  wild-type  (WT)  human
element,  an  element  with  a  mutation  in  the  strongest  predicted  poly(A)  site  (mutant  1;  1  Mut),  a  LINE-1.3  element  lacking  five  poly(A)  sites  (mutant  5;  5
Mut),  wild-type  LINE-1.3  element  without  the  neor  tag  (-  neo)  and  wild-type  mouse  LINE-1spa  (L1spa)  in  NIH  3T3  cells  probed  with  the  human  or  mouse
strand-specific  5'  UTR  probe.  FL1.3  is  full-length  human  LINE-1.3  mRNA  (the  size  difference  of  FL1.3  in  -  neo  lane  is  due  to  the  absence  of  neor  in  the
expression  cassette).  FL1spa  is  full-length  mouse  LINE-1spa  mRNA.  neor  is  neomycin-resistance  mRNA  detected  with  the  randomly  labeled  probe  to  exon  2
of  the  neor  gene.  Long  horizontal  arrows  correspond  to  the  positions  of  the  molecular  weight  RNA  marker  (Invitrogen).  RNA  bands  that  are  described
specifically  in  the  text  are  numbered.  (b)  The  high-molecular-weight  doublet  is  due  to  inefficient  splicing.  A  northern  blot  of  LINE-1.3  full-length  RNAs
produced  by  the  wild-type  human  LINE-1  element  was  probed  with  a  strand-specific  5'  UTR  probe  and  probes  to  either  exon  2  (neoExon)  or  the  intron
(neoInt)  of  neor.  Bands  corresponding  to  the  spliced  and  unspliced  mRNA  species  are  marked  FL1.3  (-)  intron  and  FL1.3  (+)  intron,  respectively.

(c)  Northern  blots  of  endogenous  and  exogenous  LINE-1  RNAs  in  different  species.  A  northern  blot  of  poly(A)+-selected  total  LINE-1.3  RNA  species 
produced  by  the  endogenous  LINE-1  (EnFL1.3)  elements  in  Ntera2  and  HeLa  cells  and  wild-type  (WT)  and  mutant  1  (1  Mut)  human  LINE-1.3  elements 
transfected  into  HeLa  and  chicken  cells  was  probed  with  human  strand-specific  5'  UTR  probe.  FL1.3  is  full-length  human  LINE-1.3  mRNA.  neor  is
neomycin-resistance  mRNA  detected  with  the  randomly  labeled  probe  to  exon  2  of  the  neor  gene  and  was  used  as  a  transfection  and  loading  control.  β-actin
is  β-actin  mRNA  detected  by  randomly  labeled  probe  to  ACTB  and  was  used  as  loading  control  for  endogenous  mRNAs  in  Ntera2  cells.
Figure  3  3'  ends  of  ESTs  relative  to  common  LINE-1  poly(A)  signals.  Map
positions  are  shown  for  the  strongest  predicted  canonical  and  noncanonical
poly(A)  sites  in  the  LINE-1.3  sequence.  The  asterisk  indicates  the  strongest
poly(A)  site  in  our  transfection  experiments.  The  chart  below  shows  the 
relative  abundance  of  human  ESTs  whose  3'  ends  map  approximately  to  the 
locations  shown.  This  represents  140  ESTs  that  had  been  oligo-dT  primed 
and  sequenced  from  the  3'  end.  There  are  only  eight  full-length  LINE-1 
elements  in  this  group,  representing  less  than  6%  of  the  total  LINE-1- 
related  transcripts.  The  largest  cluster  resides  around  position  2,000,  in 
the  region  seen  near  our  strongest  predicted  poly(A)  site,  as  well  as  those 
shown  in  the  mutant  of  that  site,  and  the  positions  seen  in  the  northern  blot 
of  endogenous  LINE-1  elements  from  Ntera2  cells. 


(Supplementary  Fig.  3  online)  supported  the  conservation  of  the 
process  of  internal  polyadenylation  of  LINE-1  RNA  in  mammals. 
There  were  few  ESTs  corresponding  to  the  position  of  band  3  (Fig.  2a) 
for  human  LINE-1,  but  more  in  the  regions  of  bands  4  and  5  (Fig.  2a), 
consistent  with  our  observations  that  the  relative  use  of  the  internal 
poly(A)  sites  varied  in  different  cell  types.  Band  3  may  be  particularly
strong  in  the  cultured  cells  used  in  these  studies,  whereas  other  sites
may  be  preferred  in  the  endogenous  tissues  used  for  most  EST  studies.

Chicken  fibroblasts  that  lack  any  endogenous  LINE-1  elements  but 
support  use  of  LINE-1  poly(A)  signals  were  transfected  with  LINE-1.3
(Fig.  2c).  We  used  3'  RACE  to  amplify  the  region  of  the  primary  truncated
LINE-1  RNA  products  in  the  northern  blots.  This  analysis
showed  a  pattern  consistent  with  the  northern  blot  results  for  the  wild
type  and  mutant  1  (Fig.  4a).  There  were  also  multiple  larger  bands  in
both  elements  that  corresponded  in  length  to  the  LINE-1.3  region  containing
poly(A)  signals  responsible  for  band  4  in  the  northern-blot
assay.  This  confirms  the  sites  of  the  functional  poly(A)  signals  and  that 
the  relative  activity  of  any  given  site  varies  depending  on  competition 
with  other  nearby  poly(A)  signals26.  Sequence  analysis  of  eight  clones 
from  the  strongest  3'  RACE  band  (Fig.  4a)  confirmed  the  use  of  that 
site.  Analysis  of  13  clones  from  the  2,053-2,153  region  of  the  3'  RACE 
gel  (Fig.  4a)  identified  7  clones  that  terminated  at  a  location  consistent 
with  the  use  of  the  ATTAAA  poly(A)  site  that  was  inactivated  in  mutant 
5  (Fig.  2a),  3  clones  that  terminated  at  a  location  consistent  with  the  use 
of  the  third  AATAAA  site  in  LINE-1  (Fig.  1),  which  was  inactivated  in 
mutant  3,  and  3  clones  that  used  various  noncanonical  sites  located  in 
that  general  region  of  LINE-1.  All  polyadenylations  occurred  at  sites
consistent  with  distance  and  with  the  sequence  normally  associated
with  3'  cleavage26-28.  We  carried  out  3'  RACE  amplification  of
poly(A)+-selected  mRNAs  from  nontransfected  and  wild-type  LINE-1.3-transfected
Ntera2  cells  (Fig.  4b).  The  bands  were  similar  in  size  to
those  identified  in  chicken  cells.  Band  1,788,  corresponding  to  the  position
of  the  strongest  poly(A)  signal,  was  not  detected  in  the  nontransfected
Ntera2  cells.  It  was  observed  on  transfection  with  the  LINE-1.3
expression  cassette,  however,  consistent  with  differences  between 
endogenous  elements  in  the  EST  data  (Fig.  3)  and  our  northern  blots  of 
LINE-1.3-transfected  cells  (Fig.  2). 

We  showed  that  LINE-1  sequences  support  the  use  of  a  broad
range  of  poly(A)  sites  that  limit  retroposition  activity  of  the  element,
resulting  in  less  retroposition  and,  therefore,  less  damage  to
their  hosts’  genomes.  Despite  the  low  levels  of  full-length  LINE-1
RNA,  LINE-1  retrotransposition  rates  can  be  quite  high,  suggesting
that  the  later  steps  of  L1  integration  may  be  very  efficient.  On
the  other  hand,  the  internal  poly(A)  sites  may  increase  the  potential
damage  when  LINE-1  elements  insert  into  a  gene,  causing  premature
termination  of  that  gene’s  transcripts.  Because  of  the
redundancy  in  the  system,  elimination  of  one  or  two  poly(A)  sites
may  affect  only  minimally  the  overall  amplification  capability  of
the  element.  Our  observation  of  a  new  application  of  the  conventional
mRNA  3'  end  formation  extends  the  range  of  the  known
mechanisms  for  the  control  of  mRNA  levels  in  mammalian  cells
and  provides  additional  understanding  of  the  selective  pressures
that  act  on  new  retrotransposition  events  in  reshaping  the  mammalian
genome.


Figure  4  3'  RACE  analysis  of  the  prematurely 
terminated  LINE-1.3  RNA  species.  (a)  3'  RACE
analysis  of  poly(A)-selected  total  mRNAs  from
chicken  fibroblasts  transiently  transfected  with
the  wild-type  (WT)  or  mutant  1  (1  Mut)  LINE-1.3
expression  vectors  described  in  Figure  1.  The 
upstream  primer  used  in  the  PCR  step 
corresponds  to  positions  1,342-1,359  of  the 
LINE-1.3  sense  strand.  M,  1-kb  DNA  ladder 
(NEBiolabs);  -RT,  without  reverse  transcriptase. 
The  band  corresponding  to  the  strongest 
predicted  poly(A)  site  identified  by  the  northern 
blot  is  labeled  1,788.  Genomic  location  and  the 
hexanucleotide  sequence  of  the  identified 
functional  poly(A)  sites  just  upstream  of 
individual  bands,  after  cloning  and  sequencing, 


are  shown  next  to  their  positions  on  the  gel.  (b)  3'  RACE  analysis  of  poly(A)-selected  total  mRNAs  from  chicken  fibroblasts  transiently  transfected  with  the 
wild-type  (WT)  or  mutant  1  (1  Mut)  LINE-1.3  expression  vectors  described  in  Figure  1  and  Ntera2  cells  nontransfected  (-WT)  and  transfected  with  the  wild- 
type  LINE-1.3  expression  cassette  (+WT).  The  upstream  primer  used  in  the  PCR  step  corresponds  to  positions  1,159-1,188  in  the  LINE-1.3  sense  strand. 
METHODS

Cell  culture.  We  maintained  NIH  3T3  (ATCC  CRL-1658),  HeLa  (ATCC  CCL2)
and  Ntera2  (ATCC  CRL-1973)  cells  at  37  °C  and  5%  CO2  in  Dulbecco’s  modified
Eagle  medium  (GIBCO)  high  glucose,  10%  Colorado  calf  serum  (GIBCO);
minimal  essential  medium  (GIBCO),  10%  fetal  bovine  serum  (GIBCO);  and
Dulbecco’s  modified  Eagle  medium  high  glucose  (GIBCO),  15%  fetal  bovine
serum,  respectively.  We  maintained  chicken  fibroblasts  (ATCC  CRL-12203)  at
39  °C  and  5%  CO2  in  Dulbecco’s  modified  Eagle  medium  high  glucose
(GIBCO)  and  15%  fetal  bovine  serum.

Transient  transfection  assay.  We  transfected  4  x  10^6  NIH  3T3  cells  and  6  x
10^6  HeLa  cells,  Ntera2  cells  or  chicken  fibroblasts  per  75-cm2  cell  culture
flask  (Corning)  with  3-6  µg  of  the  LINE-1.3  expression  cassette  by  lipofectamine
(6-12  µl  of  Plus  reagent,  18-36  µl  of  lipofectamine;  Invitrogen)
16-18  h  after  plating.  We  incubated  NIH  3T3  cells  with  the  transfection
cocktail  in  the  serum-free  medium  for  4  h  and  incubated  chicken  fibroblasts,
HeLa  cells  and  Ntera2  cells  with  the  transfection  cocktail  in  the
serum-free  medium  for  3  h.  We  incubated  all  cell  types  in  their  respective
media  for  24-26  h  before  collecting  RNA.

RNA  extraction  and  poly(A)  selection.  We  combined  the  contents  of  four  75-cm2
cell  culture  flasks  of  each  cell  type  and  extracted  total  mRNA  using  TRIzol
Reagent  (Invitrogen).  We  then  carried  out  chloroform  extraction  and  isopropanol
precipitation.  We  used  the  PolyATract  mRNA  isolation  system  III
(Promega)  to  select  poly(A)  RNA  species  as  instructed  by  the  manufacturer.  We
resuspended  poly(A)-selected  and  precipitated  RNA  in  30  µl  of  RNase-free
water  and  fractionated  it  in  a  single  lane  of  an  agarose-formaldehyde  gel.

Northern  blots.  We  transferred  RNA  to  a  Hybond-N  nylon  membrane
(Amersham  Pharmacia  Biotech)  by  capillary  transfer  overnight  at  room  temperature
in  a  standard  5x  sodium  chloride/sodium  citrate  (SSC)  solution.  We
crosslinked  the  RNA  to  the  membrane  with  ultraviolet  light  and  prehybridized
it  in  30%  formamide,  1x  Denhardt’s  solution,  1%  SDS,  1  M  NaCl,  100  µg  ml-1
salmon  sperm  DNA  and  100  µg  ml-1  yeast  tRNA  at  60  °C  for  at  least  6  h.
Hybridization  with  a  strand-specific  probe  (final  concentration  of  4-8  x  10^6
cpm  ml-1)  was  carried  out  overnight  in  the  same  solution  at  60  °C.  We  carried
out  multiple  10-min  washes  at  high  stringency  (0.1x  SSC,  0.1%  SDS)  at  60  °C.
We  quantified  the  results  of  the  northern-blot  assays  on  a  Fuji  Phosphorimager.

We  generated  the  strand-specific  probe  used  for  the  northern-blot  assay  by 
the  MAXIscript  T7  system  (Ambion).  Primer  sequences  for  generating  the  template
are  available  on  request.  We  produced  DNA  template  for  the  probe  by
PCR  with  primers  that  amplified  either  the  LINE-1.3  5'  UTR  or  exon  2  or  the
intron  of  the  neor  cassette.  The  T7  promoter  sequence  was  included  in  the
reverse  primer  of  each  pair.  We  fractionated  PCR  products  on  a  1%  low-melting
agarose  gel,  excised  them  and  purified  them  using  a  QIAquick  gel  extraction
kit  50  (QIAGEN).

3'  RACE  analysis.  We  carried  out  3'  RACE  (Clontech)  according  to  the  manufacturer’s
protocol  on  poly(A)-selected  total  mRNA  from  chicken  fibroblasts
and  Ntera2  cells  transiently  transfected  with  LINE-1.3  expression  cassette  as
described  above.  Primer  sequences  are  available  on  request.  We  gel-purified
DNA  with  QIAquick  Gel  Extraction  Kit  (QIAGEN),  cloned  it  into  pCR2.1-TOPO
vector  (Invitrogen)  and  sequenced  3-8  clones  for  each  poly(A)  site  by
automated  sequencing.

Site-directed  mutagenesis.  We  used  the  QuikChange  Site-Directed
Mutagenesis  kit  (Stratagene)  to  change  the  AATAAA  hexamer  sequence  of
the  strongest  predicted  poly(A)  site  in  the  LINE-1.3  genome  (position  1,788),
as  well  as  the  poly(A)  sites  at  positions  1,323  and  2,154,  to  GATCAA;
the  ATTAAA  hexamer  sequence  of  the
poly(A)  signal  at  position  2,053  to  ATCAAG;  and  the  GATAAA  hexamer
sequence  of  the  poly(A)  signal  at  position  2,079  to  GATCAA.
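
The hexamer substitutions described above can be mimicked in silico to check a mutant construct before ordering primers. The following is only a minimal sketch: the function name, the toy sequence and the convention that a position is the 1-based coordinate of the first base of the hexamer are assumptions for illustration, not details taken from the Methods. Only the two substitutions ATTAAA to ATCAAG and GATAAA to GATCAA come from the text.

```python
def mutate_hexamer(seq, start, expected, replacement):
    """Replace a poly(A) hexamer in seq with a mutant hexamer.

    start is assumed to be the 1-based coordinate of the first base of
    the hexamer (hypothetical convention). The function refuses to edit
    if the sequence at that position is not the expected hexamer.
    """
    if len(expected) != len(replacement):
        raise ValueError("replacement must keep the sequence length unchanged")
    i = start - 1
    found = seq[i:i + len(expected)]
    if found != expected:
        raise ValueError(f"expected {expected} at {start}, found {found!r}")
    return seq[:i] + replacement + seq[i + len(expected):]


# Toy sequence (not LINE-1.3): ATTAAA at position 5, GATAAA at position 15.
toy = "CCGGATTAAAGGCCGATAAACC"

mutated = mutate_hexamer(toy, 5, "ATTAAA", "ATCAAG")
mutated = mutate_hexamer(mutated, 15, "GATAAA", "GATCAA")
print(mutated)
```

Because each replacement has the same length as the hexamer it replaces, downstream coordinates are unchanged, so several sites can be mutated in any order.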

EST  searches.  We  identified  LINE-1  ESTs  using  a  BLAST  search  of  the  entire 
human  EST  database  with  the  LINE-1.3  sequence.  We  collected  the  first  1,085
hits,  but  only  140  corresponded  to  3'  sequences.  Those  sequenced  from  the  5' 
end  would  not  necessarily  define  the  poly(A)  site.  For  Figure  3,  the  3'  positions 
of  the  ESTs  were  binned  into  50-base  regions. 
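
The 50-base binning used for Figure 3 can be sketched as follows. The positions in the example are made-up toy values, not the EST data; the function name and the choice to label each bin by its lower edge (a multiple of 50) are assumptions for illustration. The 8-of-140 figure is the full-length count given in the Figure 3 legend.

```python
from collections import Counter


def bin_three_prime_ends(positions, bin_size=50):
    """Count 3' end positions per bin, keyed by the lower edge of each bin."""
    counts = Counter((pos // bin_size) * bin_size for pos in positions)
    return dict(sorted(counts.items()))


# Toy 3' end coordinates (hypothetical, for illustration only).
toy_positions = [1788, 1790, 2001, 2049, 2053, 6010]
histogram = bin_three_prime_ends(toy_positions)
for lower_edge, count in histogram.items():
    print(f"{lower_edge}-{lower_edge + 49}: {count}")

# Share of full-length elements among the 3'-sequenced ESTs (8 of 140).
full_length_fraction = 8 / 140
print(f"full-length fraction: {full_length_fraction:.3f}")
```

With real data, the input list would hold the LINE-1.3 coordinate at which each oligo-dT-primed EST ends.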


Accession  numbers.  LINE-1.3,  L19088;  LINE-1Gg-1a,  AF036235;  LINE-1spa,
AF016099.

Note:  Supplementary  information  is  available  on  the  Nature  Genetics  website. 